Draft version January 20, 2013 

Preprint typeset using LAT^X style cmulatcapj v. 04/20/08 



ASSIMILATING DATA INTO AN aft DYNAMO MODEL 
OF THE SUN: A VARIATIONAL APPROACH 

Laurene Jouve 

Univcrsite de Toulouse; UPS-OMP; CNRS; IRAP; 14, avenue Edouard Belin, 31400 Toulouse, France 



Allan Sacha Brun 

Laboratoire AIM Paris-Saclay, CEA/IRFU Universite Paris-Diderot CNRS/INSU, 91191 Gif-Sur-Yvette, France 

AND 

Olivier Talagrand 

Laboratoire de meteorologie dynamique, UMR 8539, Ecole Normale Superieure, Paris Cedex 05, France 

Draft version January 20, 2013 

ABSTRACT 

We have developed a variational data assimilation technique for the Sun using a toy af2 dynamo 
model. The purpose of this work is to apply modern data assimilation techniques to solar data using 
a physically based model. This work represents the first step toward a complete variational model 
of solar magnetism. We derive the adjoint afl dynamo code and use a minimization procedure to 
invert the spatial dependence of key physical ingredients of the model. We find that the variational 
technique is very powerful and leads to encouraging results that will be applied to a more realistic 
model of the solar dynamo. 

Subject headings: The Sun: activity dynamo; Methods: data assimilation 



1. INTRODUCTION 
1.1. Predicting the solar activity 

At its surface, the Sun exhibits a turbulent and very 
active behavior, with magnetic phenomena as diverse as 
sunspot emergence, flares, prominences, coronal mass 
ejections (CMEs). Quite unexpectedly this magnetic 
activity is cyclic. The full 22-year cycle is composed 
of two consecutive 11-year sunspot cycles (producing 
the so-called butterfly diagram). Coexisting with these 
large-scale ordered magnetic structures are small-scale 
but intense magnetic fluctuations that emerge over much 
of the sol ar surface , with little regard for the solar 
cycle (see IStix 20021 ). It is currently thought, that, 
in order to explain this activity and the large diver- 
sity of observed magnetic phenomena, the Sun must 
operate two conceptu ally different d ynamos: a large- 
scale /cyclic dynam o (jMoffatt 19781 : iBrun et al. 20041 : 
Charbonncau 2005 ) and a turbulent small-sca le one (e.g., 
Cattaneo fe Hughes 200ll lOssendriiver 2003 ). 

This cyclic activity has been observed directly since 
the early 1600's and traced back (indirectly) via 10 Be 
concentration fou nd in ice core for at least 10,000 years 
(jBeer et al. 1998( 1. This intense activity is known to have 
a direct impact on the Earth's upper atmosphere and on 
our technological society. Being able to anticipate and 
predict the turbulent solar dynamics and magnetic activ- 
ity is thus crucial if we wish to prevent damages to our 
satellites or interferences in our communications. This 
has led to the development of space weather studies and 
forecast. Answering key questions such as which physical 
processes lead to eruptive phenomena, what is the asso- 
ciated spectrum of solar energetic particles (SEP) and 
what leads to geoeffective interplanetary coronal mass 
ejections (ICMEs) const itute the main purpose of space 
weather (Schwcn n 2006[ K 



Solar eruptive phenomena are associated with active 
regions, i.e complexes of sunspots, that possess intricate 
magnetic field topology. There is a direct link between 
internal magnetism and these surface magnetic phenom- 
ena, since active regions are related to the emergence of 
strong toroidal structures most likely generated in the 
deep solar tachocline of intense latitudinal and radial 
shear at the base of the convection zone (jCline 2003 : 
Browning et al. 20061 : IBrun et al. 20TlT ). These toroidal 
structures become unstable, subsequently rise through 
the solar con vection zone to appear at the surface as ac- 
tive regions (iMagara &; Longcope 2003t iFan et al. 2003 : 
lArchontis et al. 2005t (j ouve fe Brun 2 009) and are ad- 
vected by convective motions on the solar surface (Wang 
& Sheeley 1991). However, the exact link between the so- 
lar cycle, CMEs and the geoef fectiveness of solar events i s 
not straightforward to assess (|Pevtsov fc Canfield 200lT) . 
It is however clear that one important goal of space 
weather is to characterize the configurations (strength, 
location, field topology, etc..) that lead to geoeffective 
events. One way to progress in our ability to predict so- 
lar activity is to assimilate quality observations in mod- 
ern numerical models of solar inner and outer magnetism 
(Schrijver & Derosa 2003). 

Hathaway et al. (1999) summarize most of the meth- 
ods used to predict the next solar cycle using historical 
data. Methods such as regression or curve fitting work 
well near solar maximum while others such as geomag- 
netic precursors perform better near minimum. It has 
also been empirically determined that odd numbered cy- 
cles are usually stronger than even numbered ones (pos- 
sibly indicating a preferred orientation of the inner solar 
magnetic field) and that on average the cycle rises in 4.8 
years and falls in 6.2 years, even though strong cycles rise 
faster to their maximum. A useful quantity to assess the 
intensity of a cycle is the yearly averaged Wolf sunspot 
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number: 

R = k(10g + s) 

with g the number of sunspot groups, s the total number 
of individual sunspots in all groups and k a variable scal- 
ing factor (with usually k < 1) that accounts for instru- 
ments or observation conditions. Hathaway et al. (1999) 
suggest that a synthesis of current methods can provide a 
more accurate and useful forecast of the evolution of the 
Wolf number. Cycle 23 was predicted by the solar cycle 

23 panel to be slightly stronger {R ~ 160) than cycle 22. 
However with an observed value of about 120, it turned 
out to be almost as weak as the even numbered cycle 20 
(R = 105.9 in 1968). Further, in the prediction summary 
of the solar cycle 23 panel, only few of the many predic- 
tions (even by taking into account their error bars) , were 
actually including the observed value of 120. 

One thus needs to be careful with the standard indica- 
tors used up to now. The existence of a panel prediction 
can be seen a s an attempt to use ensemble forecasting 
(|Kalnav 20031) . similar to what is done in meteorology. 
The relative success of these methods, in particular for 
cycles 21 and 22 (much less so for cycle 23) could be a 
sign that the set of model equations used in the panel 
form a good ensemble. However most of the techniques 
considered by Hathaway et al. do not resolve the spa- 
tial dependence of the solar activity, they just focus on 
global properties such as number of sunspots or the tim- 
ing of the next maximum. As such, these techniques are 
much less sophisticated than the ones used in weather 
forecasting. We thus need to develop more physically 
based forecast models of the solar cycle. Historically two 
types of physical models have been developed in order 
to understand the solar global dynamo: 2-D mean field 
model s and 3-D magnet ohydrodynamic (MHD) simula- 
tions (|Ossendriiver 2003h . However none of these models 
were used, up to very recently, to predict the evolution of 
the solar cycle. In order to take into account the spatial 
dependency of the solar activity, more recent approaches 
solve numerically the induction equation in a meridional 
plane and impose through a s urface term the observed 
latitudinal band of activity (|Dikpati fc Gilman 20061 : 
iCameron fc Schussler 20071: iNandv et al. 20111) . Bv~~as- 
similating sunspot or meridional flow data, they try to 
predict the peak and timing of cycle 24. 

Today, the predictions for the current solar cycle (re- 
cently summarized by the cycle 2^ prediction panel) 
differ quite significantly from one model to another 
(Hathaway 2010). Some techniques, such as the ones 
based on geomagn etic precursors, predict a weak cy- 
cle 24 (R < 100. iSvalgaard et al. 2001 iDuhau 20031: 
IdeJager fc Duhau 20091 ). others based on dynamo mod- 
els or meridional flow speed predict a stronger cycle ( R > 
140, iDikpati et al. 20061 : iHathawav fc Wilson 20041 ). It 
is worth noting that all the predictions for a weak cycle 

24 rely on cycle 23, i.e cycle n is correlated with cy- 
cle n — 1, whereas those predicting a strong cycle 24 
(i.e stronger than cycle 23) favour a correlation with cy- 
cle 22, i.e cycle n is well correlated with cycle n — 2. 
The predictions of the cycle 24 panel also differ on the 
timing of the next maximum. In 2008, the predictions 
were that the maximum would occur between 2010 and 
2012, depending on how fast the next cycle would rise 
to reach its maximum (fast if strong, slow if weak). It 



is now clear that the maximum will be reached late in 
2013 or in 2014, confirming again the difficulty to pre- 
dict the solar cycle. Some recent efforts have been un- 
dertaken to improve this situation. Kitiashvili & Koso- 
vichev (2008) for instance have used assimilation of data 
in solar dynamo m odels to predict the solar activity 
(see also the work of iChoudhuri et al. 2007t iR oth 20091: 
iRempel fc Dikpati 2009D ~ 

Assimilation of solar data i n numerical mod- 
els has thus already start ed (IDikpati et al. 2004 ; 
i Kitiashvili fc Kosovichev 20081: iBelanger et al. 2005 : 
ISchriiver fc DeRosa 200l ~ However, intrinsic difficul- 
ties in the solar weather forecast are linked to the fact 
that we do not have yet a complete comprehension of 
the solar magnetic dynamo, cycle and surface activity. 
For every "piece" constituting the full puzzle, theoretical 
developments are still underway. This work intends to 
contribute to this effort. 

1.2. Modern data assimilation techniques in weather 
forecasting 

In meteorological centers, data assimilation has been 
operational for many decades already. Various ap- 
proaches have been developed, becoming more and more 
sophisticated. Data assimilation can be defined as "using 
all available information, to determine as accurately as 
possible the stat e of the atmospheric (or oceanic) flow" 
ijTalagrand 1991 . The purpose of the work presented in 
this paper is to add the words 'solar flow and activity' at 
the end of the quote. 

Modern data assimilation techniques rely on statistical 
estimation theory, such as least squares methods. The 
generalization of such statistical methods to multivariate 
systems, leads to what is c alled the optim al interpolation 
(OI) for data assimilation (jLorenc 19810 . Optimal inter- 
polation consists in taking into account (assimilating) the 
new information that the observational data provide in 
order to advance in time the "background" state (also 
called first guess or prior information) that the weather 
forecasting numerical code has predicted. The increment 
is obtained by taking the difference, or innovation, be- 
tween the observational data and the observation opera- 
tor. The new state or analysis is then the result of the 
assimilation/forecast procedure. More specifically, let x b 
be the background vector state characterizing the current 
state of the model, H the observational operator and y° 
the observational data to be assimilated in the model, 
then one can show that the analysis x a is: 

x a = x b + W(y° - J ff(x b )), (1) 

where y° = H (x real ) + error and where W represents 
the weights determined from the estimated statistical 
error covarian ces of the forecast and the observations 
(|Kalnav 20031) . This equation is the base of modern data 
assimilation. The various assimilation methods will differ 
in the exact definition of W. 

In practice, the background state, the observations and 
even the numerical model used to simulate the Earth's 
atmosphere (i.e the primitive equations), possess errors. 
The assimilation methods consist in predicting the evolu- 
tion of the errors and of course of minimizing it, i.e keep- 
ing it under control as much as possible given the very 
chaotic nature of the Earth's atmosphere. Errors in the 
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Fig. 1. — Schematic representation of the sequential and 4D varia- 
tional data assimilation methods used in weather forecast (adapted 
from Bocquct, 2011). Upper panel: in the sequential method, the 
background state (x b ) is updated every time observations are avail- 
able (time between k and k+1) and the model evolves the state 
until the next step (following the arrows), at which observational 
data (y°) are again assimilated to produce the analysis (x a ). Lower 
panel: 4D variational method and comparison with sequential as- 
similation. In the 4D variational method, within a time interval 
the model and the observations are taken into account in the cost 
function J that needs to be "minimized". The minimization of 
this cost function results in a best trajectory (plain arrows) across 
the observations. 

dynamical atmospheric system are known to double ev- 
ery two to three days, which leads to a predictability limit 
for weather forecasting that Lorenz in 1963 was the first 
to quantify to be of the order of 15 days. This is a very 
strong constraint on our ability to predict weather pat- 
terns and solar equivalent predictability limits must ex- 
ist. However some atmospheric properties may be easier 
to predict over long periods than others, such as weekly 
averaged rainfall or temperature. It is likely that for the 
Sun, some characteristics could also be predicted over a 
longer period of time. 

In order to have a better control of the evolution of the 
errors, data assimilation methods were developed and 
split into two categories: sequential or v ariational (see 
ITalagrand 19971 : iDaley 199lHKalnay 20031 ). As shown in 
FigJtl in the sequential methods, such as 01 or Kalman 
filter, observational data are assimilated in the numerical 
model at fixed time, say every 6 hours, and then evolved 
forward in time. In the so called 4-D variational tech- 
niques, one seeks to minimize a cost (or misfit) function 



J{tC) (representing the misfit between the observations 
and the outputs of the model) within a certain time in- 
terval (usually 12 hours) for which data are available be- 
fore making a forecast. The procedure converges when 
J reaches its m inimum which occurs for £ = x a (see 
ITalagrand 20031 ). Then in the next 12-hour periods, the 
procedure is applied again, using as background state 
the numerical model of the previous 12 hours. The lat- 
ter technique is the one we wish to apply to the solar 
dynamo problem. 

1.3. Variational assimilation and the adjoint method 

Variational methods require the development 
and maintenance of a so-called adjoint model 
of the dynamical equatio ns under consideration 
(|Le Dimet fc Talagrand 19861) . This adjoint model 
computes efficiently the gradient dJ/d£ necessary to the 
iterative minimizing procedure, by evolving backward 
the adjoint system o f equations fro m the forwar d 
temporal integration (Talagra nd 20031 : iKalnav 2 003). 
Such a method is for instance also useful if one seeks 
to determine the gradient of a variable with respect to 
a large set of input variables. One can also evaluate 
the sensitivity of an erroneously predicted feature in 
the flow in order to assess which input variables are 
responsible for the error. 

Developing an adjoint model is a straightforward but 
costly task and no such models have been yet developed 
for the full MHD system of equations (and in particular 
the induction equation for the magnetic field, see next 
sections) that is required to model the solar dynamics 
and magnetic activity. The development of the adjoint 
model of the induction equation is one purpose of this 
work. 

Let us now enter a little bit more into the details of 
the adjoint procedure in order to understand how it eases 
the evaluation of the gradient of the cos t function J wit h 
respect to all the input parameters (see [Tal agrand 19911 ). 

We start by considering a composition of operations 
G = Gi Gi-i ... G 2 Gi (where G is a differ- 
entiable function) that, given a set of input variables 
u = (ui, U2, U3, u n -i, u n ), determines a set of output 
variables v = (vi,V2,Vs,—.,v m -i,v m ). 

This process can be described by the following equation 



G(u) 



(2) 



A variation Sv on the output data leads to a variation 
Su of the input data that is given at first order by the 
tangent linear equation: 



Sv = G'Su 

where G' if the local Jacobian matrix of G, i.e. 



G' = 



dui 



(3) 



(4) 



l<j<m,l<i<n 



Let us now consider a scalar cost function J % function 
of the output variables v. The gradient of the function 
J with respect to the input variables u reads: 



dj 
dm 



^ dvj dj 

^ dui dvj 
3=1 3 



with i = l, 



..,71 



(5) 
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which is in matrix notation 

V U J = G'*V V J (6) 

where G'* corresponds to the transposition of G' (hence 
the operator represented by the matrix G'* is the adjoint 
operator of the one represented by G'). 

The adjoint method thus allows to compute the gradi- 
ent of J with respect to the input variables by consider- 
ing the above expression (see appendix for more details). 
Note that since G is the composition of elementary pro- 
cess (Gk)k=i,...,h the transpose of the Jacobian matrix 
G' will be the product of the transposes of the individ- 
ual Jacobian matrices G' k , taken in the reversed order: 

G'* = G'f x G" 2 * x x G*;* (7) 

We have chosen to use this method in the framework 
of the solar dynamo, by applying it first to a simple ail 
mean field dynamo model in Cartesian geometry. We 
give more details in the Appendix on how to apply it 
specifically to the induction equation and now describe 
the model used in this work. 

2. THE SPECIAL CASE OF THE aQ DYNAMO 
2.1. Direct afl dynamo model 

The equation we are interested in is the mean-field in- 
duction equation, derived from the standard induction 
equation governing the evolution of a magnetic field in 
the presence of a conducting fluid and dissipation, in 
the framework of mean-field theory. The details of the 
derivation of this equation can be found in Steenbeck & 
Raedler (1966) or Krause & Raedler (1980). The mean- 
field equation reads: 

— = V x (v x B) + V x (aB) - V x (? ? VxB) (8) 
at 

where B and v are respectively the mean magnetic and 
velocity fields, a parametrizes the physical process re- 
sponsible for the regeneration of poloidal field and r\ is 
the effective magnetic diffusivity. 

We choose to work in Cartesian geometry with co- 
ordinates (x,y,z), which would respectively correspond 
in spherical geometry to the radius, latitude and longi- 
tude. The 3 components of the magnetic field depend 
only on the x and y coordinates. The domain is de- 
fined as [aii, £2] x [yiiVi], with a regular grid spacing 
assuming N x = N y = 30. For simplicity we assume that 
,ti,2 = ±1 and 2/1,2 = ±1. We note that the discretization 
in y is symmetric with respect to the equator defined by 
y = 0. The poloidal/ toroidal decomposition of the mag- 
netic field then reads: 

B{x,y,t) = V x (A(x,y,t)e z ) + B z (x,y,t)e z (9) 

Reinjecting this poloidal/toroidal decomposition in our 
mean-field induction equation, we get two coupled partial 
differential equations, one for the poloidal potential A 
and the other for the toroidal field B z . 

dA .d 2 A d 2 A. 

at ssaB ' + r '^ + W ) (10) 

8B Z _ dv dA dvdA d 2 B z d 2 B z 
dt dx dy dy dx dx 2 dy 2 



We choose to neglect the a-effect in the equation for 
the toroidal field since the shear is considered to be the 
dominating source term. We thus consider a simple ctVL 
dynamo model here. For boundary conditions, we as- 
sume for simplicity that both A and B z are set to zero 
on the borders x = X\ or xi for all y and on the borders 
y = yi or 2/2 for all x at all times t. As initial conditions, 
we choose a dipolar field structure, A being symmetric 
with respect to the equator y = and B z being zero 
everywhere. 

The prescribed velocity field simply expresses as 

y + 1 

v = Hqx sin(7r — - — )e z (12) 

where Qo represents the rotation rate of our domain. 

We now need to give the expression for the a-effect, 
responsible for the regeneration of poloidal field. We 
choose it to be antisymmetric with respect to the equa- 
tor, as is assumed in the Sun from surface kinetic hclicity 
measurements (Komm et al. 2007, 2008) and 3D sim- 
ulations of the c onvective interior (Mie sch et al. 20011 
iBrun et al. 20041) . Its expression is the following 

a = a cos(7r ^ ^ - ) (13) 

Finally, the magnetic diffusivity is assumed to be con- 
stant r\ = est. The profile of the physical ingredients of 
the model are shown in Fig. [2j 
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Fig. 2. — Profiles of v (upper panel) and a (lower panel) used in 
this simple model. 
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We can now nondimensionalize those equations by 
choosing a length scale L and a temporal scale L 2 jr\. 
This procedure leads to the definition of physically rele- 
vant dimensionless parameters and to the new equations: 



d 2 A, 



dt a z dx 2 dy 2 



dB z _ c dvdA _ dvdA + d 2 B z 
dt dx dy dy dx dx 2 



d 2 B z 
dy 2 



(14) 



) (15) 



with C a = ctoL/i] and Co = rtoL 2 /rj the Reynolds num- 
bers measuring the intensity of the a and f2 effects com- 
pared to the Ohmic dissipation. The product of those 
two numbers will have to be above a given threshold for 
dynamo action to occur. 

2.2. The numerical method and choice of model 
parameters 

Equations [14] and [15] are solved numerically using a 
finite difference scheme in space and time. More specifi- 
cally, we use a first order explicit Euler scheme for time 
integration and a 2nd order centered scheme in space. We 
thus have to carefully check the CFL condition: the time 
step will be constrained by the minimum of the advective 
timescales (related to ao and flo) and the diffusive time 
(related to 77). The output of such simulations will be 
two 3D arrays A and B z (two dimensions in space and 
one in time) of dimension 30 x 30 x 1000. 

A typical dynamo solution found in our model is shown 
in Fig. [3] Our set of parameters (ao = —0.02665, 
Oo = 0.03 and 77 = 0.001) was carefully chosen so that we 
are in the marginally stable regime. We are exactly at the 
threshold for which the dynamo instability is triggered, 
i.e. the growth rate of the instability is purely imaginary 
and the fields oscillate around zero without growing. If 
the absolute value of the dynamo numbers were further 
increased, the dynamo instability would grow and in this 
linear case, the magnetic energy would increase exponen- 
tially without bound. 

The lower panel of Fig[3] shows the butterfly diagram, 
i.e. a time-latitude cut of the toroidal field B z at a par- 
ticular location in depth. Again, our choice of param- 
eters, especially the sign of the dynamo number C a Cn 
was made to produce an equatorward propagating dy- 
namo wave. Indeed, Yoshimura (1975) showed that the 
direction of propagation of the dynamo wave when a ra- 
dial shear is present depends on the sign of the product 

2.3. Generating observational data 

The idea of this work is to show that data assimila- 
tion techniques can be applied to solar dynamo models. 
To do so, we develop the adjoint model necessary for 
the variational assimilation described in Section 1 and 
we test its validity. We will generate synthetic observa- 
tions with a certain set of parameters and will use our 
adjoint model to minimize the cost function and recover 
the right parameters starting from a random initial guess. 
Such a procedure is called a twin experiment and has 
been used in various situations and studies before (e.g. 
iFournier et al.. 20071 ). 
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Fig. 3. — Representative case for 00 = —0.02665, f2o = 0.03 
and rj = 0.001: time evolution of the toroidal field (plain line) and 
poloidal potential dotted line at a particular point in space (upper 
panel) and time-latitude cut of the toroidal field at a depth x p 
near the top of the domain (lower panel). The latter represents 
the butterfly diagram of our solution. The dashed and dashcd- 
dotted lines represent respectively the end of the first and second 
assimilation window. 

We choose as our synthetic data the dynamo solution 
presented in the previous section. In our twin experi- 
ments, the observations are chosen to be the toroidal field 
B z at ny specific points in space and nt points in time, 
corresponding in the Sun to the value of the sunspots 
magnetic field at different latitudes and time during the 
cycle. 

The aim of the adjoint procedure will then be to re- 
construct the state vector a, the dimension of which is 
the number of points in the y-direction, fixed to 30 in all 
calculations. In the remaining of the paper, we distin- 
guish the true physical ingredient (denoted a) and the 
state vector to be reconstructed (denoted a). 

2.4. Adjoint afl dynamo model 

In the appendix, we present the derivation of the con- 
tinuous adjoint induction equation. This helps us gaining 
some insight on the relation between the mathematical 
definition of an adjoint operator and the procedure we 
are using in this work. However, it has to be pointed 
out that it is not the adjoint partial differential equation 
which will be discretized to build the adjoint code. To 
do so, we rather attribute an adjoint instruction to each 
direct instruction in the tangent linear model deduced 
from the linearization of the direc t model. This fo llows 
the formal procedure described in (]Talagrand 19911 ) and 
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(jGiering fc Kaminski 19981) . 

The goal of the whole variational experiment here is to 
minimize a cost function J which will measure the misfit 
between the observations and the values of the variables 
calculated by the numerical model. To do so, we need 
first to define a proper cost function which will have to be 
minimized. Secondly, the idea is to choose a minimiza- 
tion algorithm which uses the values of the cost function 
(calculated by the direct integration of the model) and its 
gradient with respect to all input parameters (produced 
by the adjoint integration). 

For our studies, we choose the following cost function 



J 



nt ny 

EE 

k=i j=i 



(B z (x p , yj ,t k ) - Bf s {x p ,y 3 ,t k )) 
u,(j, fc) 2 



(16) 



where x p is a particular depth. It is chosen to be close to 
the boundary of the domain in our case, in the attempt 
to get closer to the real Sun where data are available only 
at the surface. uj{j,k) can be adjusted to give more or 
less weights to some observations, if for example some are 
more reliable than others. This would happen if a new in- 
strument with more accuracy was launched (then we can 
expect the errors on the observations to vary in time) or 
if observations of certain regions in space were less sub- 
ject to uncertainty. In our twin experiments described 
below, ui(J, k) is chosen to be constant, i.e. independent 
on the position in space or time. 

The cost function is then minimized through a quasi- 
Newton method which uses the first and second deriva- 
tives of the function. A particularity of the quasi-Newton 
methods is that they need the gradient of the function 
(which is here provided by the adjoint integration) but 
do not require exact computation of the Hessian matrix, 
which is instead approximated by an iterative algorithm 
(here the formula of Broyden-Fletcher-Goldfarb-Shanno 
is used to update the value of the Hessian approxima- 
tion). See Polak (1971) for details about the algorithm. 
We note here that the computation of the gradient of the 
cost function through the adjoint code has been tested. 
To do so, we checked that the quantity 



J(X + SX) - J{X) - SX ■ VJ(X) 



(17) 



with V J{X) calculated through the adjoint code, is or- 
der o(6X) to computer accuracy. 

3. TWIN EXPERIMENTS AND RESULTS 

As discussed in Sect l2.31 we wish to reconstruct the 
true state a. To do so, we perform several experiments 
to assess the sensitivity and quality of the reconstructed 
state. 

3.1. Regular sampling in space 

Our first experiment consists in producing data with 
the choice of parameters quoted above at regularly- 
spaced locations in space and time. More specifically, 
we fix the value for the x-coordinate (representing the 
depth in the convection zone) and we produce observa- 
tions both in the Northern and Southern hemispheres, 
with a regular spacing. Moreover, those observations will 
be available during the first cycle(s) only, with a regular 
spacing in time. 



The initial guess is a — on every grid points except 
at the boundaries y = — 1 and y = 1 where a is set to 
the true state. Indeed, the values of a at the bound- 
aries will not affect our cost function since the magnetic 
field B z is exactly set to zero at those points (see lower 
panel of Fig|3]and EqsflOland lTT]) . As a consequence, in 
the minimization procedure, only a within the domain 
is adjusted to reduce the amplitude of the cost function. 
The tolerance on the gradient is set to 1CP 12 , which is 
typically reached after about 300 iterations of our min- 
imization algorithm. By that time, depending on the 
number of observations used, the final value of the cost 
function varies between 10~ 17 and 10 -27 , i.e. has de- 
creased by at least 16 orders of magnitude. We note here 
that the number of iterations might seem large compared 
to the dimension of the state vector. However, close to 
both the boundaries and the equator, the amplitude of 
the toroidal field B z is about 100 times smaller than at 
mid-latitude. Since the a function only affects the cost 
function through its product with B z (see Eqs. [TO] and 
[TT]) . the recovery of a will be less efficient in the regions 
where B z is close to zero. If, on the contrary, those points 
are removed from the assimilation procedure and initially 
set to their true values, the convergence is much faster 
(not shown). We will discuss the difficulties of recovering 
a in the equatorial regions in the following sections. 



Reconstructed alpha 

Initial alpha 

10'*(Reconstrueted-True) 




Fig. 4. — Initial guess and a recovered by the minimization of the 
cost function with 10 observations in time and 10 regularly spaced 
observations along the y-direction for each of those 10 points in 
time. We also show the error between the recovered a and the true 
state, magnified by a factor 10 7 . 

We run our minimization procedure and compare the 
results obtained when various numbers of observations 
are assimilated. The number of points in time can be 5 
or 10, located in the first or first two cycles (see the 2 as- 
similation windows in Figj3j). In space (more specifically 
in the direction of y, representing the latitude), the num- 
ber of observations varies from 6 to 14. The total number 
of observations thus extends from 30 to 140 depending 
on the calculation. Figure [4] shows a typical result of the 
minimization algorithm for 100 assimilated observations. 
The function is perfectly recovered and the pointwise er- 
ror has been reduced by a factor 10 7 compared to the 
initial guess. 

The first conclusion which can be drawn from this ex- 
periment is that, as must necessarily be, increasing the 
number of observations decreases the error made on the 
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Fig. 5. — L2 error on a (compared to the a used to produce 
the observations) for various numbers of observations in space and 
time. Note the monotonous decrease in the error when 10 points 
in time are used as observations. 

reconstructed a (see FigJS]). However, even 30 observa- 
tions in total (5 in time times 6 in space) are sufficient to 
get an a function indistinguishable from the true state. 
The only quantitative way to compare the different ex- 
periments is thus to look at the L2 errors between the 
a coming from the minimization algorithm and the true 
a used to produced the observations. More precisely, we 
calculate the following quantity 



Eny 
3=1 



(&(yj) - a(yj)Y 



(18) 



The amplitude of those errors are shown in Fig. [SI as a 
function of the number of observations in the y-direction. 
For completeness, we show the results obtained when ob- 
servations are located both in the first cycle (nt = 5) and 
in the first two cycles (nt = 10). We clearly show here 
that the error almost monotonically drops when more 
and more observations are assimilated, reaching values of 
the order of 10 -7 for the best cases. The reconstructed a 
then produces poloidal and toroidal magnetic fields very 
much in agreement with our synthetic observations, as 
shown in Fig. [6l 

Figure |6] shows the Li errors, on the toroidal and 
poloidal fields produced by the reconstructed a effect, 
for various numbers of assimilated observations. Again, 
we find a very good agreement both for the poloidal and 
toroidal fields even for the smallest number of observa- 
tions. For larger numbers of observations, the relative 
errors reach values close to 10~ 10 and even 10 -12 for the 
toroidal field. It is interesting to note that the errors 
on the toroidal field are systematically about one order 
of magnitude less than the errors on the poloidal field. 
This is likely due to the fact that observations are avail- 
able on the toroidal field only (e.g. the cost function 
depends exclusively on Bz) and thus a better agreement 
is to be expected. We can also note on this figure that 
the errors do not grow in time and thus that the func- 
tions are recovered on the whole time interval, even if 
observations were only available on the first cycles. This 
feature is mainly due to the fact our system of equations 
is stable to perturbations of the initial conditions, mean- 



FlG. 6. — 1/2 errors on the toroidal (black lines) and poloidal 
fields (red lines) for the different experiments. 

ing that an initial perturbation would not be amplified 
nor damped. 




Fig. 7. — Gradient of J with respect to a in the case where ob- 
servations (represented by the squares at the bottom of the graph) 
are regularly spaced in y. The various curves represent the value 
of the gradient after 30 iterations of the minimization algorithm, 
50 X WJ after 100 iterations, 500 X WJ after 160 iterations and 
10000 X VJ after 200 iterations. 

For the best case considered (nt = 10, ny = 14), we 
found it instructive to follow the evolution of the gra- 
dient of the cost function with respect to a during the 
minimization procedure. We choose particular steps in 
the iterative minimization procedure, separated by suffi- 
ciently large decreases of the norm of the gradient. The 
results are shown in FigJT] where the gradient is plotted 
at those steps, with respect to the y-coordinate. The first 
thing we note is the clear decrease in the amplitude from 
the beginning to the end of the procedure, the last step 
chosen (after 200 iterations) being very close to the to- 
tal number of iterations required to achieve convergence 
(211 in this case). The second striking property of the 
curves shown on this figure is the shape of the function, 
antisymmetric with respect to the equator. This charac- 
teristic indicates that the cost function J is not sensitive 
to the values of a close to the equator and explains why 
the difficulties to reproduce the true a-effect lie mostly 
in the equatorial regions. This will be even more obvious 
in the following sections where data are chosen not to 
be distributed over the whole domain or when data are 
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perturbed by a random noise. However, the profile of the 
gradient is not surprising if we consider Eq. IB 1 61 of Ap- 
pendix [B] and Eqs. [TOl and fTTl that clearly demonstrate 
that if B z is zero, a has no influence in the equation for 
the magnetic field. Stated otherwise, the profile of V ' a J 
follows that of the mean value of B z over the time inter- 
val in which the assimilation procedure is applied. As a 
test, we plotted < B z (x, y) >t with respect to y at a par- 
ticular point in x (not shown) and indeed, we recovered 
the exact same profile as what is shown in Fig|7]for the 
various curves. 

3.2. Irregular sampling in space 

We chose here as observations a quantity B z related to 
the intensity of the sunspots magnetic field. In the real 
Sun, sunspots emerge at mid-latitudes at the beginning 
of the magnetic cycle and closer and closer to the equator 
as the cycle proceeds. For a more realistic experimental 
setting, we have studied different cases for which we have 
assimilated observations in restricted latitudinal bands. 
We first show the results of an experiment where data 
were available in one hemisphere only and in the next 
section, we investigate the case where observations are 
assimilated in the activity belt only, i.e. at low latitudes 
in both hemispheres. 

3.2.1. One hemisphere only sampling 

In this first case, we produce synthetic data only in 
the Southern hemisphere (for negative values of y) and 
study the reconstruction of the a function through the 
minimization algorithm. Again, the initial guess is ev- 
erywhere except on the boundaries and observations are 
equally spaced in time and on the first two cycles only 
(10 points in time are used here). 

Figure |8] shows the results of the minimization. It is 
clear that where data have been assimilated, the recon- 
struction of the function is much more accurate than on 
the Northern hemisphere where observations were ab- 
sent. The behavior of the function is much smoother in 
the Southern hemisphere and very similar for both sets of 
observations. On the contrary, the function strongly fluc- 
tuates on the data-free region and especially in the equa- 
torial region for the experiment where only 10 points in 
space were used. However, when observations are added 
mostly close to the equatorial region, the error is reduced 
even on the data-free region and the equatorial region is 
almost correctly recovered. 

However, even if there exists a clear asymmetry be- 
tween the two hemispheres here, it has to be noted that 
the error on the a function after minimization is much 
less than the initial error, even in the data-free region. 
This is shown on the lower panel of Fig. where the 
pointwise error is plotted for the initial guess and for the 
recovered d. We thus conclude from those experiments 
that a knowledge of the toroidal field only in one hemi- 
sphere also gives us some information on the profile of 
the a-effect in the other hemisphere. This result shows 
that a link exists between the two hemispheres, due to 
various physical processes, explaining why the intensity 
of the magnetic field in one hemisphere will influence the 
other hemisphere. In the Sun, this link could be related 
to magnetic flux crossing the equator at particular mo- 
ments during the cycle or to the dipolar topology of the 
poloidal field. 



0.0-1 




-0.04 1 .... i .... i .... i ... . 

-1.0 -0.5 0.0 0.5 1.0 

y 



0.03 




-0.03 E. 



-1.0 -0.5 0.0 0.5 1.0 



Fig. 8. — Upper panel: a reconstructed after assimilation of 
data in the Southern hemisphere only, with 2 different sets of ob- 
servations, superimposed with the true state. Lower panel: Errors 
made on the reconstructed a for the initial guess (see Fig. [2} and 
after assimilation of the 2 sets of observations. 
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Fig. 9. — Gradient of J with respect to a in the case where ob- 
servations (represented by the filled squares at the bottom of the 
graph) are available only in one hemisphere. The various curves 
represent the value of the gradient after 30 iterations of the mini- 
mization algorithm, 80 X X7J after 100 iterations, 2000 X S7J after 
400 iterations and 20000 X VJ after 990 iterations. 

Again, as in the previous section, we have followed 
the evolution of the gradient of the cost function with 
respect to d. Various instants in the minimization al- 
gorithm were chosen, namely after 30, 100, 400 and 990 
iterations (the larger number of iterations being due to 
the slower convergence of the algorithm). At the be- 
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ginning of the minimization procedure, an asymmetry 
between the two hemispheres is clearly visible, as can be 
expected. This is seen in the analysis of the full curve of 
Figl9l which represents the gradient after 30 iterations 
of the algorithm. The peak value of the gradient in the 
Southern hemisphere is here about 3 times higher than 
the peak value in the Northern hemisphere. However, as 
the minimization proceeds, the gradient in the Southern 
hemisphere is reduced more than in the Northern hemi- 
sphere, leading to a more and more symmetric profile 
with respect to the equator. 



toroidal field only. We should note that the agreement 
for the poloidal field on the Southern hemisphere is very 
satisfactory, stressing the efficiency of the variational as- 
similation. 

3.2.2. Active latitude band sampling 

If we choose as observations the sunspot magnetic field 
detected during solar cycles, we have to be aware that ob- 
servations will mainly be available in the solar activity 
belt, i.e. between about —35° and 35° in latitude (Hath- 
away 2010). 
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Fig. 10. — Difference between the components of the toroidal 
magnetic field (upper panel) and poloidal potential (lower panel) 
produced by the reconstructed a and the true state at t=0.5 (in 
the middle of the time interval). 

Once again, we can analyze the quality of the magnetic 
fields produced by the reconstructed a and calculate its 
errors compared to the true state. This is shown in Fig. 
[TUlat one instant, for the case where 8 observations were 
used. We wish here to focus on the errors at a particular 
instant in the simulation since we are interested in the 
spatial distribution of the error, rather than on its time 
evolution. We show on this figure that again the field is 
in very good agreement with the true state in the region 
where observations were assimilated, the relative errors 
reaching values as low as 10~ 6 in these regions. On the 
contrary, the agreement in the Northern hemisphere is 
much worse, even if the relative error is of the order of 
10~ 3 for the toroidal field. For the poloidal field, the 
errors are again almost one order of magnitude higher, 
still due to the fact that observations are available on the 



Fig. 11. — Errors made on the reconstructed a after assimilation 
of various numbers of observations located in the equatorial regions, 
between —35° and 35° for the broadest interval. 

We thus choose to investigate the recovery of our true 
state in a case where data are assimilated close to the 
equator only. 

Figure [TT] shows the results of various experiments 
where data have been assimilated in a more or less nar- 
row band around the equator. We present cases where 
observations have been produced successively between 
-18° and 18°, -26° and 26° and -35° and 35°. Figure 
fTTI shows the difference of each reconstructed a to the 
true state. It is quite clear again that the recovery of 
the correct a is optimal at the locations where observa- 
tions were present. Indeed, the function is very smooth 
and close to the true state at low latitudes for the first 
two runs. Close to the poles, the fluctuations around the 
true a can be quite significant, the error being there of 
the same order as the function itself for the first run. 
However, when the area spanned by the observations in- 
creases, the agreement with the true state improves and 
when observations between about —35° and 35° in lat- 
itude are used, the relative error made on a is as low 
as I0~ 7 . This is an interesting property since the ac- 
tual activity band in the Sun is approximately located 
within those latitudes. We note in this particular case 
that the errors are of the same order everywhere in the 
domain and that the difference of knowledge/information 
between the region where data were available and the 
poles is mostly absent. We conclude here that the whole 
function has been recovered to a very good accuracy for 
this case where data were assimilated only in the activity 
belt. 

Once again, we can check the results on the magnetic 
fields produced by the recovered a. Results are shown in 
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Fig. 12.— Same as Fig. [lO]but for a case where data are assim- 
ilated in the equatorial region only between —35° and 35°. 

Fig. Q21 We chose to show the results for the assimilation 
on the latitudinal band —35° to 35° since the resulting 
a function for this case was recovered to a very good 
and similar accuracy in the whole domain. The largest 
errors both on the poloidal and toroidal fields at one in- 
stant are again located mainly in the data-free regions. 
Nevertheless, we note that their amplitude remains very 
small, even in the polar regions. Again, the errors on the 
poloidal field (for which we do not produce observations) 
are about one order of magnitude larger than those on 
the toroidal field. It has to be noted here that the differ- 
ence of knowledge/information between the equator and 
the poles is visible, contrary to what we found for the 
recovered d, stressing the not so direct correspondence 
between the a-effect and the magnetic field evolution. 
The recovery within the equatorial band is excellent, the 
error reaching values close to 10 -12 for the toroidal field 
and 10 -11 for the poloidal field. 

3.3. Additional noise on the observed data 

In reality, the assimilated observations will always be 
contaminated by errors. Hence, it is natural to study the 
behavior of our assimilation technique when observations 
depart significantly from what is directly produced by 
the numerical model. To do so, we produce the same 
synthetic data by running the direct code once with the 
choice of parameters quoted in sect !2.2l We then add 
noise on the data by calculating 

B z ^ se = B? s *(l + *r) (19) 



r being a random number between —1 and 1 and a mea- 
suring the departure from the synthetic data produced 
by the direct code. 
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Fig. 13. — True state (smooth plain line) and error introduced in 
the data (magnified by a factor 50, fluctuating line) which will be 
used for the assimilation. This is a special case where the synthetic 
data have been perturbed by a noise with a standard deviation of 
a = 10- 2 . 

As an illustration, we show in Fig. [13] the time evo- 
lution of the "true" toroidal field at a specific point in 
space. We superimpose the error on B z °^ ise and the true 
state for a = 10 -2 , magnified by a factor 50. As a direct 
consequence of Eq. [HO the noise is proportional to the 
value of B z and thus the errors are higher at periods of 
maximal activity. 

The results of the assimilation procedure are shown in 
Fig. HH The number of observations used here was 100 
(10 in time multiplied by 10 in the y-direction). With 
the unperturbed synthetic observations, the assimilation 
of those particular observations gave us an L2-error on a 
of about 6 x 10 -8 and between 10 -11 and 10~ 12 for the 
magnetic fields (see figures [5] and [6]). We will thus be able 
to directly compare the results of the minimization after 
assimilation of perturbed and unperturbed data. Four 
different experiments were investigated, three of which 
are represented in Fig. [TU The only difference between 
those various experiments is the coefficient of the obser- 
vation error a. 

From the figure, it is clear that the minimization of 
the cost function gives a profiles which agree less and 
less with the true state when the noise on the assimi- 
lated data is increased. More precisely, when a = 10~ 5 , 
the a function is almost perfectly recovered, except from 
a small region around the equator in which the cost func- 
tion is less sensitive to the values of a. When a = 10~ 4 , 
the result of the minimization procedure gives an & which 
is already much less satisfactory, the L2-error to the true 
state being of the order of 10 _1 (compared to 6 x 10~ 8 
for the unperturbed case). When a is further increased, 
the recovery of the a profile is poor, the error being of 
about 50% in this case. The final errors on the toroidal 
and poloidal magnetic fields are of the same order as the 
errors introduced on the assimilated data, which shows 
that the minimization is fundamentally successful. Nev- 
ertheless, even if the true state and the final fields de- 
part of the same amount from the perturbed observa- 
tions, the errors between them are still significant. For 
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Fig. 14. — Reconstructed a after assimilation of data perturbed 
with random noises with various standard deviations. 100 obser- 
vations were used (10 in time and 10 along the y-direction). 

a = 1CP 5 , the minimum L 2 - crror reached on B z is of the 
order of 4 x 10 -5 , about 6 orders of magnitude higher 
than the typical errors in similar situations using unper- 
turbed data. 

4. CONCLUSION 

We have presented the first attempt to apply varia- 
tional data assimilation techniques to the solar dynamo. 
A very simplified formulation was used, namely a linear 
deterministic ail dynamo model in Cartesian geometry, 
which should not be taken as an accurate representa- 
tion of the magnetic field regeneration and evolution in 
the Sun. Nevertheless, we showed that with this sim- 
ple model, variational data assimilation gives us a way 
to constrain various input parameters such as the pro- 
file of the a-effect, through the minimization of the er- 
rors to very few observations (140 observations at most 
were used, out of 30000 points in the (y,t) plane). With 
regularly-spaced observations, the variational technique 
enabled us to recover the profile of the a-effect at the 
accuracy of about 10 -8 , starting from an initial guess 
with an error of 10 -2 . This recovered a then produced 
magnetic fields in extremely good agreement (accuracy 
of around 10 -12 ) with the true state. 

Moreover, we showed that a partial knowledge of the 
toroidal field could give us useful information on the a- 
effect in the whole domain. Indeed, we showed that as- 
similating data in the latitudinal belt of activity (between 
—35° and 35°) is enough to reconstruct a at all latitudes 
with a final L2-error of 10~ 7 . We also showed that adding 
noise on the observations strongly perturbed the results 
of the minimization procedure, even if the global shape of 
the a-effect was mainly recovered in all cases (and espe- 
cially the antisymmetry about the equator). Finally we 
showed that the reconstruction of the a-effect in our toy 
model is difficult near the equator if the observed (gener- 
ated) data assimilated in the procedure are insensitive to 
variations in that region, as it was the case here with B z 
being zero. However, if we had considered an a 2 tt dy- 
namo model (with the a-effect present in the production 
of B z ), that comment might have not been true. This 
will be checked in future investigations. Other quanti- 
ties such as the differential rotation or observed variables 
such as the poloidal field could help better reconstruct- 
ing information in these specific locations. It may then 



be worth trying several combinations of quantities and 
variables in our attempt to better determine the internal 
dynamics of the Sun. 

The proof of concept presented in this work is very 
promising for the future developments of solar magnetic 
activity forecast. Indeed, we showed that if a physical 
model is assumed to be sufficiently close to reality, the 
knowledge of a very small piece of information could pro- 
vide us with the reconstruction of a very important phys- 
ical process for which direct measurements are not avail- 
able. More precisely, in the case of the Sun, if we assume 
that the meridional flow (large-scale flow in the merid- 
ional plane) plays a significant role in the evolution of 
the large-scale magnetic field (IDikpati fc Gilman 20061 : 
IJouve fc~ Brun 20071 : INandv et al. 20111) and hence in the 
dynamo loop, data assimilation could be very useful. In- 
deed, the meridional circulation is very difficult to mea- 
sure accurately, especially at depths higher than a few 
tens of Mm (see review of Miesch 2005 and recent obser- 
vations of Hathaway & Rightmire 2010). However, the 
magnetic field strength and configuration now start to 
be detected with great accuracy through new satellites 
as Hinode and SDO that provide vector magnetograms 
of the full solar disk. Data assimilation is then a way to 
link the direct measurements of, say, the radial field in 
active regions and a physical model in which the merid- 
ional flow takes part in the dynamo loop. It is the case 
for instance of flux-transport dynamo models which are 
sometimes used to model the whole evolution of large- 
scale magnetic fields in the Sun. Not only would some 
subtle physical processes (i.e. difficult to detect directly) 
be reconstructed through the assimilation of accurate ob- 
servations of more accessible variables, but we could then 
use the physical models to predict the behavior of the 
next solar cycle, with a different technique from what 
was used up to now. 

We said in the introduction that a reliable technique to 
predict future solar magnetic phenomena still does not 
exist, we propose here a way to progress in this direction, 
inspired by what has been used for a long time already in 
the Earth weather community. Of course, better physi- 
cal models and better understanding of the physical pro- 
cesses interacting in a star need to be developed before 
we can safely apply data assimilation techniques to give 
tentative predictions of the solar activity. In particular, 
the goal would be to assimilate observations of excellent 
quality (which are already available) in 3D MHD global 
solar dynamo models producing realistic magnetic cycles 
(which are not yet available). In the meantime, we try 
to progress step by step towards this goal and we think 
this work constitutes one of these steps, proving the pos- 
sibility to apply modern data assimilation techniques in 
solar physics. A next step could be to use a nonlinear 
dynamo model that is sensitive to the initial conditions 
and which uses polar coordinates rather than Cartesian 
ones. Finally, we could also introduce a so-called back- 
ground term in the cost function, which limits the depar- 
ture from an a prio ri estimate of the state vector (see 
iFournier et al. 20101 for further details). This allows to 
introduce data that is not contained in the observations 
such as information on the smoothness of the physical 
parameters (like the function a for example). We intend 
to do so in the near future. 
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APPENDIX 
THE ADJOINT INDUCTION EQUATION 

In this section, we present in details the different steps leading to the determination of the continuous adjoint mean- 
field induction equation. This is only of particular use for the development of the adjoint model but we find it useful 
to gain some insight on the link between adjoint operators and the calculations shown in this work. 

The velocity field v, magnetic diffusivity r\ and the a-effect are given functions of space and time. We show here 
how to compute the adjoint of each operator appearing in the equation. We first define the adjoint operator in the 
following manner: 

\f* is the adjoint of iff operating on the Euclidian space E if and only if 

V(ui,u 2 ) e E 2 , *(ui) • u 2 = ui • **(u 2 ) (Al) 

where • is the scalar product on E. As a consequence, in order to determine the adjoint of an operator, we need to 
find the operator such that condition lAll is fulfilled. 

Let Ui and u 2 be elements of the Euclidian space E. 

1. We first try to get the adjoint of the operator u > v x u. Let \&* be such that (v x ui) -u 2 = Ui ■ ^*(u 2 ). Then 
by manipulation of vector identities, we get: 

(v x u x ) • u 2 = -Ui • (v x u 2 ) (A2) 
The adjoint of u — s- v x u is thus u — v x u. 

2. Let us now look for the adjoint of u -> V x u. Let iff* be such that (V x Ui) • u 2 = Ui • \&*(u 2 ). 

(V x m) • u 2 = ui • (V x u 2 ) + V ■ (ui x u 2 ) (A3) 

The adjoint of u — > V x u is thus u — > V x u, the term V • (ui x u 2 ) representing a boundary term which will be 
used in the adjoint integration to test the sensitivity of the cost function to the boundary conditions for example. 

3. W e now determine the adjoint of u — > au Let 'J* be such that (aui) • u 2 = Ui • \f , *(u 2 ). It is straightforward 
to see that 

(mil) ■ u 2 = ui ■ (au 2 ) (A4) 
The adjoint of u — > au is thus u — >■ au (this operator is said to be self-adjoint). 

4. Finally, we need to get the adjoint of u — > du/dt. Let iff* be such that du^/dt ■ u 2 = Ui • ^*(u 2 ). We have: 

<9ui <9u 2 9(m • u 2 ) 

^ r -u 2 = -u 1 . — + (A5) 

The adjoint of u — > du/dt is thus u — > —du/dt, the term g f now representing an initial conditions term which 
could be used the adjoint integration to study the effect of the initial conditions on a particular cost function. 

We are thus able now to write the adjoint induction equation, using the property that the adjoint of a composition 
of operators is the compositions of the adjoint operators, taken in the reverse order. 
Finally, we have: 

$B 

— = vx(VxB)-aVxB + Vx (ryVxB) (A6) 
VARIATIONAL APPROACH 

In this section we will follow and adapt the procedure described in (jTalagran d 20031) . Let's consider the coupled 
induction equations [TOl and [111 for the fields A and B z . We search solutions of this set of equations over the rectangular 
domain D = [x 1 ,x 2 ] x [2/1,2/2] x [^1^2] hi (x,y,t)-space. These equations are first order with respect to t and second 
order with respect to x and y. 

Consider now a field B° bs (x , y , t) of observations over the domain D. Since we assimilate data only on the toroidal 
field (as a proxy of the surface sunspots) our cost function J is written: 

J(B) = \ [f( (B z - Bf s ) 2 dxdydt (Bl) 
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its variation is thus: 



5J 



(B z - Bf s )5B z dxdydt 



(B2) 



D 



We aim at expressing the variations of the cost function J to variations of our well-defined input parameters which 
are: 

- The values of A and B z for all points in space at the initial time t = t\ 

- The constant magnetic diffusivity rj 

- The function representing the a-effect a(x, y) 

- The azimuthal velocity function v(x,y) 

Let's derive the tangent linear equation obtain by differentiating equations [TOl and [TT1 with respect to A, B z , the 
parameter rj and the functions v and a and name respectively their variations 5 A, SB Z , Srj, 5v, 5a. The equations 
read: 

— - *aB. - oz5B z - ^ + ^) - ^ + w ) = (B3) 
0£B„ <9<5u<9,4 <9<5u<9,4 dvdM 3<ud<L4 



dt 



dx dy dy dx dx dy dy dx 
.8 2 5B Z 8 2 5B Z . .8 2 B Z 8 2 B Z „ 



9y 2 



9x 2 



(B4) 



Using Lagrange multipliers X(x,y,t) and ~f{x,y,t) respectively for equations IB3I and IB41 we get (introducing a 
negative sign for simplicity): 



5J = 



(B z - B° bs )5B z - A 



85 A 



foB z - a5B z - 77(^3-0- + -5-5-) 



S V (— + —) 
dx 2 dy 2 



dt dx 2 dy 2 

dSB z ddvdA 85v 8 A dv 85 A 8v 85 A 



Ot 



dx dy dy dx dx dy dy dx 



(B5) 



d 2 5B z d 2 5B z 
dx 2 dy 2 



Srj( 



d 2 B z 8 2 B Z 



dx 2 



dy 2 



dxdydt 



We now wish to remove all the differentiation operating on A, B z , n, v and a. To do so we use as many integration 
by parts as necessary, for the sake of clarity we demonstrate the procedure for a few typical terms: 



,95 A , , , 
A— — dxdydt 
dt 



X2 V2 t 

J J \5Adxdy 

D xi yi 

Diffusion terms require a double integration by parts: 

2/2 t 2 



ii 



8X 
dt 



5 A dxdydt 



D 



D 



, d 2 5A , , , 
A?; dxdydt - 

dx z 



85 A , , 
~8x~ 



Vl *l 

V2 t 2 



85A , , 
A77— — dydt 
dx 



D 

V2 t 2 



djXn) 85 A 
dx dx 

dx 



dxdydt 



5 A dydt 



Vl *l 



Vl ti 



d 2 {Xn) 
dx 2 



5 A dxdydt 



(B6) 



(B7) 



We note that the double integration modifies the sign of the diffusion term relative to the time derivative, this is 
expected since by going backward in time we should anti-diffuse. 



14 



Jouve et al. 



V2 t 2 

dvdSA , , , f f d« , , 

^ V -^ dxdvdt = -\ J V V 

D yi ti 

Terms involving variations of the ingredients are treated likewise: 

X2 *2 



D 



dAdSv , , , 
^^y- dxdydt 



"f-^—Sv dxdt 
ox 



(12 



Hi 



D 



d dv 

- {l -)SAdxdydt 



^(7^)^ dxdydt 



(B8) 



(B9) 



Applying systematically this method to all the terms that possess differentiation of the variations and grouping the 
terms by variations, we get for bj the following equation: 



SJ = 



dX d 2 X d 2 X d dv d dv 
dt + ^dx* + £hf> + cl? 7 cV ~ dy~ { - 1 dx~\ 



5A 



+ 



dj 
~di 



9 2 7 5 2 7 



(B z - Bf s ) 



SB Z 



+Sr] 



d 2 A 



dx 2 dy 



d 2 A^ ( 8 2 B Z 



dx 2 



d_ dA ) _d_ ( dA. 



X2 V2 



t- SaXB 

d 2 B z - 
dy 2 [ 

dxdydt 

t 2 
ti 



XSA dxdy 



xi Vl 



V2 t 2 



Vl *i 

X2 t 2 



dSA dX 



A- 



dx dx 



dSA dX 



SA 



- ^SA 



Xl ti 
X2 V2 



dy dy 



dv r A r dA i , - 
7—0^4 + 701;—— dydt 

dy dy 



dv . . . dA . , , 
+ 7— — dA — jov—— dxdt 
dx dx 



j5B z dxdy 



V2 ti 



Vl ti 

X2 *2 



95 B z _ fry 
dx dx 

d6B z 9 7 
dy dy 



dydt 



dxdt 



(BIO) 



This expression is valid for any X(x,y,t) and ^y(x,y,t). The space-time integral can be cancelled by imposing that 
A and 7 verify the following partial differential equations: 



dx 

~dt 



&X + (PX, 
dx 2 dy 2 ' 



d , dv^ 
dx dy 



— (7—) = 
dy dx 



(Bll) 



d-f , ,9 2 7 5 2 7, 



dt 



(B z -B 



obs \ 







(B12) 



We also impose that A(x, j/,^) an d 7(x,j/,t 2 ) equal zero for any x or y. Further since A and B z are constrained 
to be equal to zero at the boundary, their variation are zero, and all the terms involving either SA and 5B Z in the 
surface integrals above vanish. For the terms involving derivatives of 5 A and SB Z we impose that A and 7 equal zero 
on boundaries x = x\,X2 and y = y x,V2 for any t. 

We note that equations IB11I and IB12I with t he c ondit ions on A and 7 just stated above unambiguously define the 
functions in the whole domain. Equations IB1 II and IB12I are first-order in time and second-order in space, and define a 
well-posed problem for backward integration with respect to t, because of the positive sign of the diffusion terms. The 
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specification of A and 7 at the final time t2 and along the spatial boundaries Xi,X2, Ui and y<i therefore unambiguously 

define the functions \{x, y, t) and 7(3;, y, t). 

Taking into account these various conditions, equation IB10I reduces to: 



,1:2 112 



SJ 



8rj 



X(x, y, ti)6A(x, y, ti)dxdy 
d 2 A d 2 A 



X2 V2 X2 V2 

-f(x, y, tx)SB z (x, y, t{)dxdy + 



dx 2 dy 2 



)+7( 



xi yi 

d 2 B z 8 2 B Z 



t2 

6 a J \B z dt 



dxdy 



dx 2 



dy 2 



dxdydt 



5v 



— (7—) - — (7— ) 
dx dy dy dx 



dxdydt (B13) 



The above equation reveals that the partial derivatives of the objective function J with respect to 77, a(x, y), v(x, y), 
A(x,y,ti) and B z (x,y,ti) are equal to: 



dj 



dj 



— (x,y,ti) = X(x,y,t 1 ) and — -(x, y, t x ) = j(x, y, h) 



dA 



dj 

drj 



dB z 

x/ d 2 A d 2 A x ,d 2 B z 
K— + — ) + 7(- ^ 



dx 2 



dy 2 



dx 2 



d 2 B z 
dy 2 



dxdydt 



(B14) 
(B15) 



dj_ 

da 



dj 

dv 



— = / \B Z dt ,M(x,y) 



to 



d_ dA d_ dA. 



dt ,y(x,y) 



(B16) 



(B17) 



We here solve a simplified problem by considering that A and B z at t = t\ are known (seelTalagran d fc Courtier 19871 
for discussions about sensitivity to initial conditions). 
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